Building Machine Learning Powered Applications: Going from Idea to Product

Author: Emmanuel Ameisen
Publisher: O'Reilly
Publication date: 2020-02-11
ISBN-13: 9781492045113
ISBN-10: 149204511X
Binding: Quality paper (also called trade paper)
Pages: 256





Description


Learn the skills necessary to design, build, and deploy applications powered by machine learning. Through the course of this hands-on book, you'll build an example ML-driven application from initial idea to deployed product. Data scientists, software engineers, and product managers with little or no ML experience will learn the tools, best practices, and challenges involved in building a real-world ML application step-by-step.
Author Emmanuel Ameisen, who worked as a data scientist at Zipcar and led Insight Data Science's AI program, demonstrates key ML concepts with code snippets, illustrations, and screenshots from the book's example application.
The first part of this guide shows you how to plan and measure success for an ML application. Part II shows you how to build a working ML model, and Part III explains how to improve the model until it fulfills your original vision. Part IV covers deployment and monitoring strategies.
This book will help you:

Determine your product goal and set up a machine learning problem
Build your first end-to-end pipeline quickly and acquire an initial dataset
Train and evaluate your ML model and address performance bottlenecks
Deploy and monitor models in a production environment


Table of Contents


How to Contact Us
Acknowledgments
I. Find the Correct ML Approach

  1. From Product Goal to ML Framing
    Estimate What Is Possible
    Models
    Data
    Framing the ML Editor
    Trying to Do It All with ML: An End-to-End Framework
    The Simplest Approach: Being the Algorithm
    Middle Ground: Learning from Our Experience
    Monica Rogati: How to Choose and Prioritize ML Projects
    Conclusion
  2. Create a Plan
    Measuring Success
    Business Performance
    Model Performance
    Freshness and Distribution Shift
    Speed
    Estimate Scope and Challenges
    Leverage Domain Expertise
    Stand on the Shoulders of Giants
    ML Editor Planning
    Initial Plan for an Editor
    Always Start with a Simple Model
    To Make Regular Progress: Start Simple
    Start with a Simple Pipeline
    Pipeline for the ML Editor
    Conclusion
II. Build a Working Pipeline
  3. Build Your First End-to-End Pipeline
    The Simplest Scaffolding
    Prototype of an ML Editor
    Parse and Clean Data
    Tokenizing Text
    Generating Features
    Test Your Workflow
    User Experience
    Modeling Results
    ML Editor Prototype Evaluation
    Model
    User Experience
    Conclusion
  4. Acquire an Initial Dataset
    Iterate on Datasets
    Do Data Science
    Explore Your First Dataset
    Be Efficient, Start Small
    Insights Versus Products
    A Data Quality Rubric
    Label to Find Data Trends
    Summary Statistics
    Explore and Label Efficiently
    Be the Algorithm
    Data Trends
    Let Data Inform Features and Models
    Build Features Out of Patterns
    ML Editor Features
    Robert Munro: How Do You Find, Label, and Leverage Data?
    Conclusion
III. Iterate on Models
  5. Train and Evaluate Your Model
    The Simplest Appropriate Model
    Simple Models
    From Patterns to Models
    Split Your Dataset
    ML Editor Data Split
    Judge Performance
    Evaluate Your Model: Look Beyond Accuracy
    Contrast Data and Predictions
    Confusion Matrix
    ROC Curve
    Calibration Curve
    Dimensionality Reduction for Errors
    The Top-k Method
    Other Models
    Evaluate Feature Importance
    Directly from a Classifier
    Black-Box Explainers
    Conclusion
  6. Debug Your ML Problems
    Software Best Practices
    ML-Specific Best Practices
    Debug Wiring: Visualizing and Testing
    Start with One Example
    Test Your ML Code
    Debug Training: Make Your Model Learn
    Task Difficulty
    Optimization Problems
    Debug Generalization: Make Your Model Useful
    Data Leakage
    Overfitting
    Consider the Task at Hand
    Conclusion
  7. Using Classifiers for Writing Recommendations
    Extracting Recommendations from Models
    What Can We Achieve Without a Model?
    Extracting Global Feature Importance
    Using a Model’s Score
    Extracting Local Feature Importance
    Comparing Models
    Version 1: The Report Card
    Version 2: More Powerful, More Unclear
    Version 3: Understandable Recommendations
    Generating Editing Recommendations
    Conclusion
IV. Deploy and Monitor
  8. Considerations When Deploying Models
    Data Concerns
    Data Ownership
    Data Bias
    Systemic Bias
    Modeling Concerns
    Feedback Loops
    Inclusive Model Performance
    Considering Context
    Adversaries
    Abuse Concerns and Dual-Use
    Chris Harland: Shipping Experiments
    Conclusion
  9. Choose Your Deployment Option
    Server-Side Deployment
    Streaming Application or API
    Batch Predictions
    Client-Side Deployment
    On Device
    Browser Side
    Federated Learning: A Hybrid Approach
    Conclusion
  10. Build Safeguards for Models
    Engineer Around Failures
    Input and Output Checks
    Model Failure Fallbacks
    Engineer for Performance
    Scale to Multiple Users
    Model and Data Life Cycle Management
    Data Processing and DAGs
    Ask for Feedback
    Chris Moody: Empowering Data Scientists to Deploy Models
    Conclusion
  11. Monitor and Update Models
    Monitoring Saves Lives
    Monitoring to Inform Refresh Rate
    Monitor to Detect Abuse
    Choose What to Monitor
    Performance Metrics
    Business Metrics
    CI/CD for ML
    A/B Testing and Experimentation
    Other Approaches
    Conclusion
Index

About the Author


Emmanuel Ameisen has worked for years as a Data Scientist. He implemented and deployed predictive analytics and machine learning solutions for Local Motion and Zipcar. Recently, Emmanuel has led Insight Data Science's AI program where he oversaw more than a hundred machine learning projects. Emmanuel holds graduate degrees in artificial intelligence, computer engineering, and management from three of France's top schools.




Related Books

Java Data Mining: Strategy, Standard, and Practice: A Practical Guide for Architecture, Design, and Implementation

Authors: Mark F. Hornick, Erik Marcadé, Sunil Venkayala

2020-02-11

多變量分析在社會科學領域之應用─SPSS 操作與資料分析

Author: 林曉芳

2020-02-11

智能硬件與機器視覺:基於樹莓派、Python 和 OpenCV

Author: 陳佳林

2020-02-11